Extended Seminar in Machine Learning and Data Mining
The seminar page can be found here in TUCaN.
Who, when and where?
The seminar will be held as a block seminar on January 8th and 9th, 2018, in Room A126.
The group write-ups are due on January 29, 2018, and the students' reviews of these write-ups are due on February 12, 2018.
The kick-off meeting was on Tuesday, October 24, 2017, 17:10h in C110.
The seminar will be jointly held by Profs. Carsten Binnig, Johannes Fürnkranz, and Kristian Kersting.
Instructions for the write-ups are now available.
Prerequisites
It is not necessary to have prior knowledge in artificial intelligence, but prior knowledge in data mining and machine learning is helpful. Participation is limited to 20 students. If more students apply, those with prior knowledge in data mining and knowledge discovery will be preferred. The selection will be made at the kick-off meeting.
For further questions, feel free to send an email to ml-sem@ke.tu-darmstadt.de. No prior registration is needed; however, please still send us an email so that we can estimate the number of participants beforehand and have your email address for possible announcements. Also make sure that you are registered in TUCaN.
Content
This year's topic of the seminar is Interactive Machine Learning, i.e. machine learning algorithms that are meant to be used interactively or co-actively with a human user. We will concentrate on recent papers published in workshops, journals, and conferences. A list of topics is available below. The topics will be assigned based on an on-line bidding process, which will be opened after the kick-off. The final assignment will be made a week later.
Extended Seminar?
What is "Extended" about this seminar? Students are not only expected to give a short talk, but also to prepare a small write-up. The write-up will be prepared in groups; each group will cover one theme consisting of four topics. The final write-up must be concise and should give a short overview of the theme (not necessarily limited to the studied papers).
In addition, we will run a peer-review process, as is usual at scientific conferences. This means that you also have to read some of the other write-ups and provide feedback by filling out a review form.
Because Extended Seminars are more work, students receive 4 CPs for them (instead of 3 CPs for regular seminars).
Talks
Although each topic is typically associated with a single paper, the point of the talk is not to exactly reproduce the entire contents of the paper, but to communicate the key ideas of the methods that are introduced in the paper. Thus, the content of the talk should exceed the scope of the paper, and demonstrate that a thorough understanding of the material was achieved. See also our general advice on giving talks.
Students are expected to give a 20 (!) minute talk on the material they are assigned, followed by 10 minutes of questions. Note that the comparatively short period of time forces you to get the most important points of your topic across. You are not expected to present everything.
The talks are expected to be accompanied by slides. In case you do not own a laptop, please send us the slides in advance, so that we can prepare and test the slides. The talk and the slides should be in English.
Write-Up
The talks are organized in topical groups. Each group must prepare one short write-up of their work.
Content: The papers are related to each other. Your task is to use these papers to create a mini-survey that combines the results of all papers, and possibly other papers. The contribution of each individual paper can be limited to the most important points that this paper contributes to the topic. There must be a clear common thread within each survey; a concatenation of individual paper summaries is not enough. A possible outline can consist of an introduction to set the stage and outline the cross-cutting themes of all papers, multiple sections on individual contributions w.r.t. cross-cutting themes and comparison of different approaches, a joint related work section, and a summary and outlook.
Format: The format for the write-up is predefined, and follows conventions that are typically used for publications in computer science. In particular, we require each paper to be formatted according to the Template for Contributions in IEEE Transactions. Each paper should have no more than 5 pages in this format (the bibliography is not counted, and can be as long as necessary). The format must not be changed in order to generate more space. Each paper also must, of course, have a title, authors, and an abstract. The templates are available in Word and LaTeX, but we strongly recommend that you try to use LaTeX. Environments such as MiKTeX and TeXstudio make local LaTeX-editing quite easy, and web-sites like Overleaf offer collaborative working environments for LaTeX.
Deadline: The write-ups are due on January 29, 2018.
Reviewing
Reviewing assignments will be made in January based on your bids. A reviewing form will be provided by then. The deadline for the students' reviews is February 12, 2018.
Topics and Schedule
All papers should be available on the internet or in the ULB. Note that SpringerLink often only works on campus networks (sometimes not even via VPN). If you cannot find a paper, contact us.
Session 1: Active Learning (Jan. 8th, 13:00h)
- Luna A.:
Jason Baldridge, Miles Osborne: Active learning and logarithmic opinion pools for HPSG parse selection. Natural Language Engineering 14(2): 191-222 (2008)
- Jakob W.:
Sudheendra Vijayanarasimhan, Kristen Grauman: Large-Scale Live Active Learning: Training Object Detectors with Crawled Data and Crowds. International Journal of Computer Vision 108(1-2): 97-114 (2014)
- Herrmann L.:
Zhou S, Chen Q, Wang X (2014) Active Semi-Supervised Learning Method with Hybrid Deep Belief Networks. PLoS ONE 9(9): e107122.
- Felix H.:
Jamieson, K.G., Jain, L., Fernandez, C., Glattard, N.J., Nowak, R.: NEXT: A System for Real-world Development, Evaluation, and Application of Active Learning. In: Cortes, C., Lawrence, N., Lee, D., Sugiyama, M., Garnett, R. (eds.) Advances in Neural Information Processing Systems 28 (NIPS). pp. 2638–2646 (2015)
Session 2: Hyperparameter Optimization (Jan 8th, 15:30h)
- Jia H.:
Lindauer, M.T., Hoos, H.H., Hutter, F., Schaub, T.: AutoFolio: An Automatically Configured Algorithm Selector. Journal of Artificial Intelligence Research 53, 745–778 (2015)
- Yantao S.:
Lars Kotthoff, Chris Thornton, Holger H. Hoos, Frank Hutter, Kevin Leyton-Brown: Auto-WEKA 2.0: Automatic model selection and hyperparameter optimization in WEKA. Journal of Machine Learning Research 18: 25:1-25:5 (2017).
Aaron Klein, Stefan Falkner, Simon Bartels, Philipp Hennig, Frank Hutter: Fast Bayesian Optimization of Machine Learning Hyperparameters on Large Datasets. AISTATS 2017: 528-536
- Yimin X.:
Yutian Chen, Matthew W. Hoffman, Sergio Gomez Colmenarejo, Misha Denil, Timothy P. Lillicrap, Matthew Botvinick, Nando de Freitas: Learning to Learn without Gradient Descent by Gradient Descent. ICML 2017: 748-756.
- Jianyang T.:
Lisha Li, Kevin Jamieson, Giulia DeSalvo, Afshin Rostamizadeh, Ameet Talwalkar: Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization (only on arXiv so far)
Session 3: Interactive Machine Learning (Jan 9th, 10:00h)
- Matthias B.:
Dzyuba, V., van Leeuwen, M., Nijssen, S., De Raedt, L.: Interactive Learning of Pattern Rankings. International Journal of Artificial Intelligence Tools 23(6) (2014)
- Maximilian O.:
Pannaga Shivaswamy, Thorsten Joachims: Coactive Learning. Journal of Artificial Intelligence Research 53: 1-40 (2015)
- Maciej M.:
Odom, P., Natarajan, S.: Actively interacting with experts: A probabilistic logic approach. In: ECML PKDD 2016, Lecture Notes in Computer Science, vol. 9852, pp. 527–542. Springer, Cham (2016)
- Sascha S.:
Stefano Teso, Paolo Dragone, Andrea Passerini: Coactive Critiquing: Elicitation of Preferences and Features. AAAI 2017: 2639-2645
Session 4: Interaction with Expert Knowledge (Jan 9th, 13:30h)
- Alexander L.:
Hu, Z., Ma, X., Liu, Z., Hovy, E., Xing, E.: Harnessing Deep Neural Networks with Logic Rules. In: 54th Annual Meeting of the Association for Computational Linguistics (ACL). pp. 2410–2420. ACL (2016)
- Erich W.:
Stefano Teso, Roberto Sebastiani, Andrea Passerini: Structured learning modulo theories. Artificial Intelligence 244: 166-187 (2017)
- Leon Ch.:
Marco Túlio Ribeiro, Sameer Singh, Carlos Guestrin: "Why Should I Trust You?": Explaining the Predictions of Any Classifier. Proceedings KDD 2016: 1135-1144
- Emil A.:
Caruana, R., Lou, Y., Gehrke, J., Koch, P., Sturm, M., Elhadad, N.: Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission. In: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Sydney, NSW, Australia, August 10-13, 2015. pp. 1721–1730 (2015)
Bidding
Papers were distributed via bidding. Students could bid for papers using this form. All students received a paper in one of their two most preferred categories. A better choice was often not possible (e.g., if you bid only for a single paper and set all others to "Don't Want It", it is unlikely that you received your paper).